Soil fertility impact on recruitment and diversity of the soil microbiome in sub-humid tropical pastures in Northeastern Brazil

Soil fertility is a key factor in pasture systems and shapes microbial communities and their functionality. Understanding the interaction between soil fertility and microbial communities can therefore improve our ability to manage pasturelands and maintain their soil functioning and productivity. This study examined the influence of soil fertility on microbial communities in tropical pastures in Brazil. Soil samples, collected from the top 20 cm of twelve distinct areas with diverse fertility levels, were analyzed via 16S rRNA sequencing. The soils were subsequently classified into two categories, high fertility (HF) and low fertility (LF), using K-means clustering. Random forest analysis revealed that HF soils had greater bacterial diversity, predominantly Proteobacteria, Nitrospira, Chloroflexi, and Bacteroidetes, whereas Acidobacteria increased in LF soils. HF soils exhibited more complex network interactions and an enrichment of nitrogen-cycling bacterial groups. Additionally, functional annotation based on 16S rRNA varied between clusters. Microbial groups in HF soils showed enhanced functions such as nitrate reduction, aerobic ammonia oxidation, and aromatic compound degradation. In contrast, in LF soils the predominant processes were ureolysis, cellulolysis, methanol oxidation, and methanotrophy. Our findings expand knowledge of how soil fertility drives bacterial communities in pastures.


SUPPLEMENTARY TABLES
Table S1. Statistical summary of the chemical attributes of HF and LF clusters.

Columns: Measure; HF (mean, sd); LF (mean, sd); p-value (a).

Table S3. Correlation analysis between chemical attributes and the two main axes of PCA.

Table fragment (columns: mean, sd (a), p-value (b)): Observed (c) 910.9 120.8 781.6

Table S5. Main metrics of the microbial co-occurrence network in pasture clusters.
(e) Network modules represent important ecological units, which can have significant implications for biological or ecological functions [1].
(f) Network diameter is the longest of the pairwise shortest path lengths, or the size of the largest connected component, quantified by the average number of edges. The diameter may or may not be correlated with the number of edges [2, 3].
(g) Average degree, or node connectivity, is the number of direct connections a node has to other nodes [2].
(h) Average path length, or average shortest path length, is an indicator of system performance or of the degree of compaction of the microbial structure. In other words, it indicates the average number of steps required to get from one node to another in the network [4].
(i) Betweenness centrality is the average importance of nodes in the network, and all modes, being higher in nodes connected to more influential (more connected) neighbors. It therefore measures the importance of nodes by their frequency of occurrence on paths connecting other nodes [3, 5].
(j) The degree to which nodes in a graph tend to cluster, defined as the ratio of the connections that exist between neighbors to the number of all possible connections. In other words, it quantifies the tendency of the graph to be divided into subunits. Values >0.4 suggest that the network has a modular structure [6]. High modularity signifies densely connected nodes within certain groups and sparse inter-group connections in the network [5].
(k) A measure of the integrity and effectiveness of the network: the fraction of possible connections that are actually observed. Density represents the ratio of observed microbial associations to all potential associations, given the network's nodes [3].
(l) Proportional to the modularity of the habitat, it quantifies the number of distinct modular structures [7].
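Several of the metrics defined in notes (f), (g), (h), (j), and (k) can be computed directly from an edge list. The sketch below is only an illustration on a hypothetical five-node toy network (`toy_edges`), not the pipeline used to build Table S5. It assumes an undirected, unweighted, connected graph, and every function name is introduced here for the example.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Return an adjacency dict for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def network_metrics(edges):
    """Compute basic topology metrics (assumes the graph is connected)."""
    adj = build_graph(edges)
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) // 2

    # Shortest path lengths over all ordered pairs of distinct nodes.
    lengths = []
    for u in adj:
        dist = shortest_path_lengths(adj, u)
        lengths.extend(d for v, d in dist.items() if v != u)

    # Local clustering: realised links among a node's neighbors
    # divided by the number of possible links among them.
    local = []
    for u, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            local.append(0.0)
            continue
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        local.append(links / (k * (k - 1) / 2))

    return {
        "nodes": n,
        "edges": m,
        "average_degree": 2 * m / n,
        "density": 2 * m / (n * (n - 1)),
        "diameter": max(lengths),
        "average_path_length": sum(lengths) / len(lengths),
        "clustering_coefficient": sum(local) / n,
    }


# Hypothetical toy network: a triangle (A, B, C) with a tail (C-D-E).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
metrics = network_metrics(toy_edges)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

For a real co-occurrence network, a graph library would normally replace these hand-written routines. Modularity and betweenness centrality are left out here because they need a community-detection step and a path-counting step, respectively.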

(a) P-values highlighted in bold were considered significant (p < 0.05) according to factorial analysis. (b) Cation exchange capacity. (c) Total organic carbon.

Figure S2. Spatial location and vegetation status of the sample sites. The coordinates of the six sites located in the Agreste (a) and Mata (b) zones were plotted on maps of the Normalized Difference Vegetation Index (NDVI). Each site was composed of two pastures that differed in NDVI (c) and nitrogen concentration in the aerial part of the pastures (d), according to the comparison between means by the t-test (p < 0.05), with these values being associated with soil chemical attributes and climatic conditions of each location. For comparative purposes, the pastures were named P1 (less productive) and P2 (more productive), each one being formed by pastures from all sites.

Figure S3. Dispersion of edaphic variables in pastures with high (HF) and low (LF) fertility levels. Comparisons with more than one asterisk (*) indicate that HF and LF clusters had distinct means and were representative of populations with distinct distributions, according to t-tests (parametric) and Wilcoxon signed-rank tests (nonparametric), respectively, both at the 5% significance level.

Figure S4. Associations between metagenomic prediction based on the 16S rRNA gene and annotation based on contigs of prokaryotic origin obtained from shotgun metagenomic sequencing. The relative frequency data were incremented by a small value (10^-10) to eliminate null counts and then transformed by the expression 1/-log10(x). The Spearman correlation coefficient was used to test the degree of correlation between the data.
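The transformation and correlation described for Figure S4 can be sketched in a few lines. This is a minimal illustration using only the Python standard library. The two frequency vectors (`predicted`, `shotgun`) are hypothetical values, the helper names are introduced here, and the Spearman coefficient is computed as the Pearson correlation of average ranks rather than with whichever statistical package the authors used.

```python
import math

PSEUDOCOUNT = 1e-10  # small increment that removes null counts before the log


def transform(freq):
    """Apply 1 / -log10(x) to a relative frequency after adding the pseudocount.

    Assumes freq + PSEUDOCOUNT < 1, otherwise the logarithm is zero or positive.
    """
    return 1.0 / -math.log10(freq + PSEUDOCOUNT)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical relative frequencies of five functions in the two data sets.
predicted = [0.0, 0.02, 0.10, 0.30, 0.05]  # 16S-based prediction
shotgun = [0.0, 0.01, 0.25, 0.20, 0.08]    # shotgun-based annotation

rho = spearman([transform(v) for v in predicted],
               [transform(v) for v in shotgun])
print(f"Spearman rho = {rho:.3f}")
```

Because 1/-log10(x) increases monotonically for frequencies below 1, it preserves ranks. In this sketch the transform therefore changes how the data are spread for plotting but leaves the Spearman coefficient the same as on the untransformed frequencies.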

Table S2. Location and climatic characteristics of the sites sampled in the Agreste Meridional and Zona da Mata mesoregions, Pernambuco state, Brazil.

Table S4. Statistical summary of microbial alpha-diversity indices of HF and LF pastures.

Table S6. Relative composition (%) of phyla in the main modules of co-occurrence networks.